Bioinformatics (Thomas Dandekar, Meik Kunz)

and usually occurs indirectly, e.g. via glucocorticoids, and is often also associated with intracellular communication. An example of cellular communication is second messengers, which allow rapid communication; similarly central is ATP in the energy supply of the cell (ATP is critically important for movement). Most ATP is generated in the respiratory chain after energy-rich compounds have been broken down via glycolysis (anaerobic) and the citric acid cycle (aerobic). The reducing equivalents (NADH, FADH2) are oxidized in the respiratory chain, and the energy released is used to synthesize ATP. Bioinformatically, one can look at metabolism and develop a
kinetic (dynamic) model for this. Another example of cellular communication is differentiation, which relies on cell-to-cell communication. Here, for example, haematopoiesis (blood formation) would be interesting. For this, one can bioinformatically look at the kinase network. Important for cell differentiation is the central organizer (Spemann organizer), which determines the developmental axes in the embryo via the Wnt signaling pathway. This can also be studied bioinformatically, e.g. by modeling with cellular automata or agent-based simulations. In most cases, it is therefore of interest to know the
role of the protein under study and where it is localized, for example in the membrane or in the cell nucleus, in order to also draw conclusions about its function. For this purpose, there are already numerous databases in which one can find relevant interactions and information, e.g. PlateletWeb, KEGG, STRING and SPdb (Signal Peptide database; https://proline.bic.nus.edu.sg/spdb/). Bioinformatically, one can also predict localization, for example with SignalP (prediction of signal peptides; https://www.cbs.dtu.dk/services/SignalP) or TargetP (https://www.cbs.dtu.dk/services/TargetP). Given a training dataset of proteins with
known, experimentally verified localization, these programs learn to predict a particular localization from the amino acid composition. The localization in the cell can thus be determined from the protein sequence with the help of programs based on hidden Markov models or neural networks, and new sequences to be investigated can then be assigned accordingly. Specifically, a transcription factor should be localized in the nucleus, an acid protease in the lysosome, a storage protein in the Golgi, a secreted protein in the endoplasmic reticulum and a membrane protein (prediction with TMHMM) in the membrane, and so on. A program should also predict this accordingly. If you want to write your own program, it should have an input part and an output part. In between is the processing part (prediction part), which consists of either a neural network or a hidden Markov model.
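
The three-part structure described above (input, prediction, output) can be sketched as follows. This is only a minimal sketch in Python and not a real localization predictor: all function names are hypothetical, the two sequences are invented toy data, and the prediction part is a crude placeholder rule (fraction of hydrophobic residues at the N-terminus) that merely stands in for the neural network or hidden Markov model, which would have to be trained on proteins with experimentally verified localization.

```python
from typing import Callable, Dict

# Assumption: a simple set of hydrophobic residues, chosen for illustration only.
HYDROPHOBIC = set("AVLIMFWC")


def read_fasta(text: str) -> Dict[str, str]:
    """Input part: parse FASTA-formatted text into {identifier: sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Fall back to a generated name if the header line is empty.
            name = header[0] if header else f"seq{len(records) + 1}"
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def placeholder_predictor(sequence: str, window: int = 20,
                          threshold: float = 0.5) -> str:
    """Processing part (placeholder, NOT a trained model).

    Labels a sequence by the fraction of hydrophobic residues in its first
    `window` residues. A real program would call a trained neural network
    or hidden Markov model here instead.
    """
    n_term = sequence[:window]
    if not n_term:
        return "unknown"
    fraction = sum(res in HYDROPHOBIC for res in n_term) / len(n_term)
    if fraction >= threshold:
        return "secretory pathway (candidate)"
    return "other"


def write_report(predictions: Dict[str, str]) -> str:
    """Output part: one tab-separated line per sequence."""
    return "\n".join(f"{name}\t{label}" for name, label in predictions.items())


def run(fasta_text: str,
        predictor: Callable[[str], str] = placeholder_predictor) -> str:
    """Connect input, prediction and output."""
    records = read_fasta(fasta_text)
    predictions = {name: predictor(seq) for name, seq in records.items()}
    return write_report(predictions)


# Invented toy sequences, for illustration only.
EXAMPLE_FASTA = """>toy1
MKLLLLAVVALLAFAGSSAA
>toy2
MSDEEKRKDSGTPEDNKRSQ
"""

print(run(EXAMPLE_FASTA))
```

Because the prediction part is passed in as a function, the placeholder rule can later be exchanged for a trained model without touching the input and output parts.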

The information content of a message can be described with the Shannon entropy: one bit is the smallest unit of information, a “yes” or “no” decision. Words and sentences can thus be assigned their information content according to their length. In a further step, one can include the different signal sources and consider the quality, i.e. how high or low the information value is; it is low, for example, if the same character is always sent. This
knowledge can also be transferred to biological systems, for example if one wants to take a bioinformatic look at cell differentiation or intracellular communication, such as a signal cascade within body cells via second messengers (e.g. cAMP). In this way, signal transmission for cell growth and cell differentiation can be described in more detail, for example the amplification or attenuation of cellular signals by kinases and phosphatases (the quality of the signal depends on the ratio of signal to background noise). It is thus possible to observe and model various complex cellular processes bioinformatically and to understand them better.
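
The Shannon entropy mentioned above can be made concrete with a short calculation. The following is a minimal sketch in Python, assuming that the symbols of a message are independent and that their probabilities are estimated from the relative frequencies within the message itself; the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Average information per symbol in bits: H = sum(p_i * log2(1 / p_i))."""
    if not message:
        return 0.0
    total = len(message)
    counts = Counter(message)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def information_content(message: str) -> float:
    """Total information of the message in bits (length times entropy)."""
    return len(message) * shannon_entropy(message)


# If the same character is always sent, the information value is zero.
print(shannon_entropy("AAAAAAAA"))
# Four equally frequent symbols need two yes/no decisions each.
print(shannon_entropy("ACGTACGT"))
print(information_content("ACGTACGT"))
```

A message consisting of a single repeated character yields 0 bits per symbol, whereas a sequence in which all four bases occur equally often yields 2 bits per symbol, so that its total information content grows with its length.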

20.11 Design Principles of a Cell